American Ninja Warrior - Kaplan-Meier Survival Analyis

Kaplan-Meier
Log Rank Tests
Exploring Survival Analysis using the Kaplan-Meier method with American Ninja Warrior data.
Author
Affiliation

Jonathan Lieb

Baylor University

Published

July 2, 2024

  • This module would be suitable for an in-class lab or take-home assignment in an intermediate statistics course.

  • It assumes a familiarity with the RStudio Environment and R programming language.

  • Students should be provided with the following data file (.csv) and Quarto document (.qmd) to produce visualizations and write up their answers to each exercise. Their final deliverable is to turn in an .html document produced by “Rendering” the .qmd.

  • Posit Cloud (via an Instructor account) or Github classroom are good options for disseminating files to students, but simply uploading files to your university’s course management system works, too.

Welcome video

Introduction

In this module, we will explore American Ninja Warrior data from the 2021 stage 1 finals through the lens of survival analysis. Survival analysis is a statistical method used to analyze the time until an event of interest occurs (often this event is death). In this case, the obstacles will be our time, and the event of interest will be DEATH (we’re calling failing off an obstacle death).

American Ninja Warrior Logo

Imagage Source Wikimedia

One of the simplest methods for survival analysis is the Kaplan-Meier method. This method estimates the survival function, which is the probability that a subject survives past a certain time. The Kaplan-Meier method is non-parametric, meaning it makes no assumptions about the distribution of the survival times. It is a step function that decreases at each event time.

Learning Objectives

By the end of this module you should be able to:

  • Load data into R

  • Calculate Survival Estimates using the Kaplan-Meier method

  • Censor data

  • Make Survival Probability Plots

  • Compare Survival Curves for different groups





NOTE: R is the name of the programming language itself and RStudio is a convenient interface. To throw even more lingo in, you may be accessing RStudio through a web-based version called Posit Cloud. But R is the programming language you are learning :)

During this lab we will explore the following research questions:

  • What is the probability of a competitor surviving past a certain obstacle?
  • How do conditional probabilities of survival change as the competitor progresses through the course?
  • How do the survival probabilities differ based on sex?

Getting started: American Ninja Warrior data

The first step to any analysis in R is to load necessary packages and data.

Running the following code will load the tidyverse, survival, and survminer packages and the anw_2021_stage1 data we will be using in this lab.

TIP: As you follow along in the lab, you should run each corresponding code chunk in your .qmd document. To “Run” a code chunk, you can press the green “Play” button in the top right corner of the code chunk in your .qmd. You can also place your cursor anywhere in the line(s) of code you want to run and press “command + return” (Mac) or “Ctrl + Enter” (Windows).

TIP: If you get an error while attempting to load the library or data, you may need to install the package or check the file path. You can install packages using the install.packages() function in R. Once you have installed a package, you can load it into your R environment using the library() function. If that is not the issue, you may need to check the file path to ensure the data is in the correct location.

# Load packages
library(tidyverse)
library(survival)
library(survminer)

# Load Data
ninja <- read_csv("anw_2021_stage1.csv")

We can use the glimpse() function to get a quick look at our ninja data. The glimpse code provides the number of observations (Rows) and the number of variables (Columns) in the dataset. The “Rows” and “Columns” are referred to as the dimensions of the dataset. It also shows us the names of the variables and the first few observations for each variable.

glimpse(ninja)
Rows: 69
Columns: 5
$ name            <chr> "Meagan Martin", "DeShawn Harris", "Heather Weissinger…
$ sex             <chr> "F", "M", "F", "M", "F", "F", "M", "M", "M", "F", "F",…
$ obstacle        <chr> "Slide Surfer", "Slide Surfer", "Swinging Blades", "Sw…
$ obstacle_number <dbl> 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, …
$ cause           <chr> "Fall", "Fall", "Fall", "Fall", "Fall", "Fall", "Fall"…

Another useful function to get a quick look at the data is the head() function. This function shows the first few rows of the dataset.

head(ninja)
# A tibble: 6 × 5
  name               sex   obstacle        obstacle_number cause
  <chr>              <chr> <chr>                     <dbl> <chr>
1 Meagan Martin      F     Slide Surfer                  1 Fall 
2 DeShawn Harris     M     Slide Surfer                  1 Fall 
3 Heather Weissinger F     Swinging Blades               2 Fall 
4 Cam Baumgartner    M     Swinging Blades               2 Fall 
5 Brittney Durant    F     Swinging Blades               2 Fall 
6 Casey Rothschild   F     Swinging Blades               2 Fall 
Exercise 1: Data Structure
  1. How many observations are in the dataset?

  2. How many variables are in the dataset?

  3. Which variable should be used as the time variable in the survival analysis?


TIP: Type your answers to each exercise in the .qmd document.

Terms to know

Before proceeding with the survival analysis, let’s make sure we understand American Ninja Warrior and some of it’s vocabulary to help us climb our way through this lab.

How does American Ninja Warrior work?

American Ninja Warrior is an NBC competition show where competitors attempt to complete a series of obstacle courses of increasing difficulty. In a single obstacle course, the competitors must complete a series of obstacles in a row. If they fail an obstacle, they are eliminated from the competition. The competitors also have a time limit to complete the course. The competitors are ranked based on how far they get in the course and how quickly they complete it.

Most of the obstacles are designed to test the competitors’ upper body strength. Some obstacles require balance and agility though.

The warped wall is arguably the most famous, although now least difficult, obstacle on an American Ninja Warrior course. The warped wall is a curved wall that competitors must run up and grab the top of. The warped wall is on every course and is often the final obstacle, although this is not the case on the Finals courses.

The warped wall which was previously 14 feet and is now 14.5 feet tall. They have even had a 18 foot warped wall on the show.

Warped Wall

Image Source: SB Nation

The obstacles in American Ninja Warrior are all given names. For example, the famed Warped Wall is a curved wall that competitors must run up and grab the top of. The Salmon Ladder is a series of rungs that competitors must move up by jumping and pulling themselves up.

Watch Enzo Wilson complete the American Ninja Warrior Course at the 2021 Finals Stage 1 (Season 13) in the video below.

Key Terms
  • Obstacle: A challenge that competitors must complete to move on in the competition.

  • Course: A series of obstacles that competitors must complete in a row. A typical course has 6-10 obstacles.

  • Stage: A round of the competition. The competition starts with Stage 1 and progresses to Stage 4.

  • Time Limit: The amount of time competitors have to complete the course, often between 2-4 minutes.

Variable Descriptions

The ninja data you’ll be analyzing in this lab provides the individual run information for each ninja in the 2021 Finals Stage 1 (Season 13). The data includes the ninja’s name, their sex, the obstacle they failed on, and the cause of that failure.

Variable Descriptions
Variable Description
name Name of the American Ninja Warrior
sex Sex of the American Ninja Warrior (M/F)
obstacle The name of the obstacle the ninja failed on
obstacle_number The obstacles’ place in the run
cause What caused the ninja to fail (Fall/Time/Complete)

Points of Confusion

Use the following code to create a table of the obstacle names and their corresponding obstacle numbers. This will help you understand the order of the obstacles in the course.

ninja |> 
  distinct(obstacle, obstacle_number)
# A tibble: 10 × 2
   obstacle        obstacle_number
   <chr>                     <dbl>
 1 Slide Surfer                  1
 2 Swinging Blades               2
 3 Double Dipper                 3
 4 Jumping Spider                4
 5 Tire Run                      5
 6 Dipping Birds                 7
 7 The High Road                 8
 8 Fly Hooks                     8
 9 Cargo Net                     9
10 Complete                     10

It can be seen quickly that there is no obstacle number 6 in the data and that there are 2 obstacles with number 8. Both of these appear confusing at first, but they have logical explanations.

The missing obstacle number 6 is due to the fact that no ninja failed on obstacle 6. Some quick research tells us that obstacle 6 was the warped wall.

Read about the 2021 Stage 1 Split Decision here.

The duplicate obstacle number 8 is due to the fact that Stage One of the 2021 Finals allowed a split-decision. This means that competitors could choose between two different obstacles for obstacle 8.

Another important things to note are that obstacle 10 is not really an obstacle, but rather the end of the course.

Lastly, one competitor Joe Moravsky ran the course twice (falling the first time and completing it the second time). This is because he was the Safety Pass Winner from the previous round. The Safety Pass allows a competitor to run the course again if they fail the first time.

Kaplan-Meier Survival Analysis

Censored Data

In survival analysis, we often have need to censor data. Censored data occurs when the event of interest has not occurred for some of the observations. In our case, the event of interest is the failure of a competitor on an obstacle. If a competitor completes the course, we do not know which obstacle they would have failed on. Additionally if the time limit is reached, we do not know which obstacle the competitor would have failed on, although we do know that they didn’t fail on any of the ones they completed before time was up. Both of these situations are examples of censored data.

Use the code below to create a new column called censor in the ninja data that is a binary indicator of whether or not the observation should be censored. This column will be used to indicate whether the data is censored or not.

# Makes a column called censor that is 1 if the competitor failed and 0 if they completed the course or ran out of time
ninja <- ninja |> 
  mutate(censor = if_else(cause %in%  c("Complete", "Time"), 0, 1))

Calculating Survival Probabilities

The Kaplan-Meier estimator uses information from all of the observations in the data and considers survival to a certain point in time as a series of steps defined at the observed survival times.

In order to calculate the probability of surviving past a certain point in time (past a certain obstacle in this case), the conditional probability of surviving past that point given that the competitor has survived up to that point must be calculated first.

The formula for conditional probability of surviving past a point in time (\(t_i\)) given that the competitor has survived up to that point in time(\(t_{i-1}\)) is:

Note: This function could also be written as \(P(T > t_i | T \geq t_{i-1}) = \frac{n_i - d_i}{n_i}\)

\(P(T \geq t_i | T \geq t_{i-1}) = 1- \frac{d_i}{n_i}\)

Where:

\(d_i\) is the number of competitors that failed at time \(t_i\)

\(n_i\) is the number of competitors that were at risk at time \(t_i\)

The Kaplan-Meier estimator is the product of the conditional probabilities of surviving past each point in time up through that point in time.

\(\hat{S}(t) = \prod_{t_i \leq t} (1 - \frac{d_i}{n_i})\)

Note: Censored data does not count in the at risk competitors

where \(n_i = n_{i-1} - d_{i-1} - c_{i-1}\)

For example if we have this data:

# Setting a seed for reproducibility
set.seed(123)

# Creating fake data
fake_data <- tibble(obstacle_number = c(1:5, 4:5), censor = c(rep(1, 5), rep(0, 2))) |> 
  sample_n(25, replace = TRUE)

head(fake_data)
# A tibble: 6 × 2
  obstacle_number censor
            <int>  <dbl>
1               5      0
2               5      0
3               3      1
4               4      0
5               3      1
6               2      1

And we wanted to calculate the Kaplan-Meier estimate of surviving past obstacle 2 we would need to find the following probabilities:

\(P(T > 1 | T > 0) = P(T > 1) = 1 - \frac{\text{number of competitors that failed at obstacle 1}}{\text{number of competitors that attempted obstacle 1}}\)

\(P(T > 2 | T > 1) = 1 - \frac{\text{number of competitors that failed at obstacle 2}}{\text{number of competitors that attempted obstacle 2}}\)

The following code calculates the first two probabilities:

fake_data_summary <- fake_data |> 
  filter(obstacle_number <= 2) |> 
  group_by(obstacle_number) |> 
  summarize(fails = sum(censor == 1)) |> 
  ungroup() |>
  mutate(at_risk = 25 - cumsum(fails),
         cond_prob = 1 - fails/at_risk)

fake_data_summary
# A tibble: 2 × 4
  obstacle_number fails at_risk cond_prob
            <int> <int>   <dbl>     <dbl>
1               1     4      21     0.810
2               2     3      18     0.833

We would then need to multiply these two probabilities together to get the Kaplan-Meier estimate of surviving past obstacle 2.

\(P(T > 2) = P(T > 1) * P(T > 2 | T > 1)\)

The following code calculates the Kaplan-Meier estimate of surviving past obstacle 2:

kaplan_meier <- fake_data_summary |> 
  summarize(kaplan_meier = prod(cond_prob))

The Kaplan-Meier estimate of surviving past obstacle 2 in this fake example is 0.6746032.

Kaplan-Meier Estimator Manual Calculation

The ninja data frame contains information about individual competitors in the ninja competition. We will need to summarize the data to calculate the Kaplan-Meier estimator manually.

Exercise 2: Manual Calculation of Kaplan-Meier Estimator

In this exercise you will calculate the Kaplan-Meier estimator of surviving past each obstacle in the ninja competition step by step.

Part 1: Number of Events

The first step is to calculate the number of competitors that failed and the number of competitors that were censored at each point in time. These are the \(d_i\) and \(c_i\) values needed to calculate the conditional probability of surviving past each point in time.

Use the following code sum the number of competitors that failed and the number of competitors that were censored at each obstacle.

ninja_summary <- ninja |> 
  group_by(obstacle = obstacle_number) |>
  summarize(fails = sum(cause == "Fall"),
            censor = sum(cause %in% c("Complete", "Time")))
  1. At which obstacle did the most competitors fail?

  2. At which obstacle were the most competitors censored?

Part 2: At Risk Competitors

The second step is to calculate the number of competitors at risk at each point in time. This is the \(n_i\) value needed to calculate the conditional probability of surviving past each point in time.

Use the following code to calculate the number of competitors at risk at each point in time from the ninja_summary data frame.

ninja_summary <- ninja_summary |> 
  mutate(attempts = 69 - lag(cumsum(fails), default = 0) - lag(cumsum(censor), default = 0))
  1. Which obstacle had the most competitors at risk?

  2. Why don’t censored competitors contribute to the number of competitors at risk at the obstacle?

Part 3: Conditional Fail Probability

The third step is to calculate the conditional probability of surviving past each point in time. This is the \(P(T \geq t_i | T \geq t_{i-1})\) value needed to calculate the Kaplan-Meier estimator and is calculated as \(1 - \frac{d_i}{n_i}\).

  1. Use the ninja_summary data frame to calculate the probability that someone fails each obstacle. Do this using the mutate function to create a new column called fail_prob. Fail probability is the number of competitors that failed divided by the number of competitors at risk. Save this data frame as ninja_summary.
  1. What percentage of at-risk competitors failed the first obstacle?

  2. What percentage of at-risk competitors failed the fifth obstacle?

  3. Which obstacle had the highest fail probability?

  4. Did obstacle 2 or obstacle 7 have a higher fail rate?

Part 4: Kaplan-Meier

The final step is to calculate the Kaplan-Meier estimator of surviving past each point in time (\(\hat{S}(t)\)) This is calculated as the product of the conditional probabilities of surviving past each point in time up through the desired point in time (\(\prod_{i=1}^{t} P(T \geq t_i | T \geq t_{i-1})\)).

Use the following code to calculate the Kaplan-Meier estimator manually by multiplying the conditional probabilities of surviving past each point in time up through the desired point in time.

ninja_summary <- ninja_summary |> 
  mutate(km = cumprod(1 - fail_prob))
  1. What is the Kaplan-Meier estimate for surviving past the first obstacle?

  2. What is the Kaplan-Meier estimate for surviving past first five obstacles?

  3. What is the farthest obstacle that for which the Kaplan-Meier estimator has more 50% of competitors surviving?

Plotting the Kaplan-Meier Estimator

We will now use ggplot2 to plot the Kaplan-Meier estimator for the ninja competitors. The Kaplan-Meier estimator is a step function, so we will use geom_step to plot the estimator. We will also use geom_point to plot the points where the estimator changes.

Part 5: Plotting the Kaplan-Meier Estimator
  1. Use the ninja_summary data frame fram in conjunction with ggplot2’s geom_step and geom_point to plot the Kaplan-Meier estimator for the ninja competitors.

  2. Comment on the plot.

  3. What do you notice about where the lowest point on the plot in regard to survival probability? Does survival probability reach zero? Why or why not?

TIP: You can pipe the data frame into the mutate function to create a new column.

TIP: The mutate function works like so: data_frame |> mutate(new_column = calculation)

Type ?geom_step, ?geom_point, or ?ggplot in the console to learn more about these functions.

TIP: Remember that you can add the + operator to continue adding layers to the plot like seen below

ggplot(your_data, aes(x = time_var, y = kaplan_meier_var)) +
  geom_step() +
  geom_point()

TIP: You can also add labels to the plot using the labs function like seen below

ggplot(your_data, aes(x = time_var, y = kaplan_meier_var)) +
  geom_step() +
  geom_point() +
  labs(title = "Your Title",
       x = "X Axis Label",
       y = "Y Axis Label")

Using R to Calculate the Kaplan-Meier Estimator

Phew! That was a lot of tedious work to calculate and plot the Kaplan-Meier estimator manually. Luckily, there is a much easier way to calculate the Kaplan-Meier estimator using R.

The survival package in R provides a function called survfit that can be used to calculate the Kaplan-Meier estimator. The survfit function requires a Surv object as input. The Surv object is created using the Surv function, which requires two arguments:

  1. The time to event data. The time to event data is the time at which the event occurred or the time at which the individual was censored. In our case this is the obstacle_number in our ninja data.

  2. The event status. The event status is a binary variable that indicates whether the event occurred or the individual was censored. The event status is coded as 1 if the event occurred and 0 if the individual was censored. This is contained in the censor column of the ninja data.

Below a survfit model is created for the ninja.

ninja_km <- survfit(Surv(obstacle_number, censor) ~ 1, data = ninja)
Exercise 3: Kaplan-Meier Estimates and Interpretation

Use summary(ninja_km) to view a summary of the Kaplan-Meier estimator.

  1. Do the values in the survival column match the values you calculated manually?

The output also shows the 95% confidence intervals.

  1. Which obstacle number is the first point in time where a survival rate of less than .5 falls within the 95% confidence interval?

  2. What do you notice about the standard error as the time increases?

The computations for calculating the Confidence Interval for the K-M Estimate are fairly complex. The method most commonly used is called the log-log survival function and was proposed by Kalbfleisch and Prentice (2002). This function is computed by \(ln(-ln[\hat{S}(t)])\) with variance derived from the delta method and calculated by \[ \frac{1}{[ln(\hat{S}(t))]^2}\sum_{t_i\leq{t}}\frac{d_i}{n_i(n_i - d_i)} \].

The endpoints for the confidence interval for the log-log survival function are therefore found by \(ln(-ln[\hat{S}(t)]) \pm Z_{1-\alpha / 2} SE [ln(-ln[\hat{S}(t)]) ]\)

And the endpoints expressed by the computer and seen in the summary are \(exp[-exp(\hat{c}_u)] \text{ and } exp[-exp(\hat{c}_l)]\)

Quartile Interpretation

The three quartiles are common statistics to look at when doing a survival analysis. The interpretations of these are as follows:

Note: If the data is uncensored the estimate is just the median of the data. If the data is censored, the KM estimate is used to find these by finding the time at which it drops below the percentile

  1. 25th Percentile- 75% of the people survive past this point in time

  2. Median- 50% of the people will survive past this time

  3. 75th Percentile- 25% survive past this time

Exercise 4: Interpreting Quartiles

Use the results from quantile(ninja_km) to answer the following questions

  1. At what time does the lower CI endpoint imply 75% of the competitors survive past? What time does the upper CI endpoint?

  2. What is the interpretation of the NA values in the 75 percentile columns?

  3. What is the earliest time (within the 95% confidence interval) the at which 50% of the competitors could suffer a fall by?

Plotting with R

After fitting a Kaplan-Meier model, we can use the ggsurvplot function from the survminer package to plot the Kaplan-Meier estimator. The ggsurvplot function requires the Kaplan-Meier model as input.

Below is an example of how easy it is to plot the Kaplan-Meier estimator using R.

ggsurvplot(ninja_km,
           conf.int = TRUE)

Kaplan-Meier Estimator by Groups

Sometimes we may want to compare the survival probabilities of different groups of individuals. For example, we may want to compare the survival probabilities base on age, gender, treatment group, or a variety of other factors.

In our data set, we have a variable called sex that indicates the gender of the competitor. We can use this variable to compare the survival probabilities males and females.

The Kaplan-Meier estimator can be calculated for different groups of individuals quite easily in r. We simply have to add the variable sex to our function in the model to do this.

ninja_km_gender <- survfit(Surv(obstacle_number, 
                                censor)~ sex, data = ninja)

A plot can be created to help us compare these groups.

Fun Fact: Jesse Labreck was the only woman to complete the 2021 American Ninja Warrior Stage 1 Finals Course.

You can watch her incredible run below

ggsurvplot(ninja_km_gender,
           conf.int = TRUE)

Exercise 5: Kaplan-Meier Estimates by Group
  1. Which sex group has a higher survival probability for most of the time points?

  2. Which sex has a wider confidence interval throughout the time points?

  3. Explain why your answer to b might be the case.

Checking the differences between the groups

One way of checking the differences between the groups is to use the survdiff function in the survival package. The survdiff function can be used to test the null hypothesis that the survival probabilities of the two groups are equal. The survdiff function requires a Surv object as input.

Note: The log-rank test is a non-parametric test. This means that it does not assume that the data is normally distributed.

The test that the survdiff function performs is a log-rank test. The formula for the test statistic of the log-rank test is:

\[X^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{V_j} \]

where

  • \(O_j\) is the observed number of events in group \(j\)
  • \(E_j\) is the expected number of events in group \(j\)
  • \(V_j\) is the variance of \(O_j - E_j\)
  • \(k\) is the number of groups
ninja_km_diff <- survdiff(Surv(obstacle_number,
                               censor) ~ sex, data = ninja)
Exercise 6: Comparing Survival vs. Expected

Use the code below to see the results of the survdiff function

ninja_km_diff
Call:
survdiff(formula = Surv(obstacle_number, censor) ~ sex, data = ninja)

       N Observed Expected (O-E)^2/E (O-E)^2/V
sex=F 12       10     4.11      8.44      10.7
sex=M 57       27    32.89      1.05      10.7

 Chisq= 10.7  on 1 degrees of freedom, p= 0.001 
  1. How many female competitors are in the data set? How many fell? How many were expected to fall (round to the nearest whole number)?

  2. Did more or less male competitors fall than expected?

  3. What is the p-value of the test? What does this mean?

Other Log Rank Based Tests

Although the survdiff function uses the most common log-rank test, there are a variety of different methods for comparing survival between two groups.

These methods are all similar to a standard log-rank test but weight time points in order to detect differences better. Other common methods include:

  • Gehan-Breslow method: A generalized Wilcoxon test
  • The Tarone-Ware method: Detects early differences
  • The Peto-Prentice method: Uses weighted time points based on the pooled survival estimates. More robust than TW or GB. Detects early differences.
  • The Fleming-Harrington method, which is another weighted version of the log-rank test.

The surv_pvalue function in the survival package can be used to calculate the p-value for all of these tests by changing the method argument.

Exercise 7: Log-Rank Tests

Run the code below to see the p-values for the different methods.

surv_pvalue(ninja_km_gender) #survdiff
surv_pvalue(ninja_km_gender, method = "n") #Gehan Breslow (generalized Wilcoxon)
surv_pvalue(ninja_km_gender, method = "TW") #tarone-ware
surv_pvalue(ninja_km_gender, method = "PP") #Peto-Prentice
surv_pvalue(ninja_km_gender, method = "FH") #Fleming-Harrington
  1. Using \(\alpha = 0.05\), do all of the tests lead to the same conclusion? If so what is the conclusion? If not which ones agree and which ones do not?

  2. Which test is the most conservative?

More Practice

Read in the anw_2023_stage1 data to complete the following problems.

Do It Yourself
Challenge 1: Cleaning Data
  1. There is one ninja in the dataset who did a second run. Find and remove this second run from the data.

  2. Why should this run be removed? (Hint: Think knowledge gained)

  3. Add a variable called censor to the dataset that is a 0 if time ran out or the ninja completed the course, but is a 1 otherwise.

Challenge 2: Fitting KM
  1. Use the survfit function to find the Kaplan-Meier estimates

  2. How many competitors were at risk at the fifth obstacle?

  3. What is the survival estimate that a ninja survives past the seventh obstacle?

  4. What is the median number of obstacles that a ninja survives on the course?

Challenge 3: Comparing Groups
  1. Fit a survdiff model to compare the survival of males and females as well as another survfit model to find the Kaplan-Meier estimates for each group

  2. What is the test-static value of the log-rank test?

  3. What is your conclusion about the survival of males and females. Why?

  4. Make a plot of the Kaplan-Meier estimates, showing the confidence intervals for each group.

  5. Is it more likely that a female survives past the fourth obstacle or that a male survives past all eight obstacles?

  6. Assume that the null hypothesis is that the true survival rate of surviving the first 4 obstacles is equal for males and females. The alternative hypothesis is that the true survival rates are different. Based on 95% confidence intervals, is it correct to say that the true mean of survival rate for males to survive past the fourth obstacle is different than the true mean of for females to survive past the fourth obstacle? Explain your answer.

The winner of the 2023 American Ninja Warrior competition was Vance Walker. Walker won $1 million for his victory.

Vance Walker

Image Source: NBC

Fun Fact: Thread the Needle, the 8th obstacle on the 2021 Finals Stage 1 course, saw the highest failure rate of any obstacle. It had not previously been used on the Finals Stage 1 courses.

Thread the Needle Image Source: Sasukepedia

Click here to read more about Edward Kaplan and Paul Meier’s work and how it has impacted the field of Statistics, particularly in regards to biostatistics.